Smart Paper Notes

Hybrid Handcrafted and Learnable Audio Representation for Analysis of Speech Under Cognitive and Physical Load

Gasser Elbanna , Alice Biryukov , Neil Scheidwasser-Clow , Lara Orlandic , Pablo Mainar , Mikolaj Kegler , Pierre Beckmann , Milos Cernak

Categories: Artificial Intelligence | Machine Learning

2022-03-30

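As a neurophysiological response to threat or adverse conditions, stress can affect cognition, emotion, and behavior, with potentially detrimental effects on health under sustained exposure. Since the affective content of speech is inherently modulated by an individual's physical and mental state, a substantial body of research has been devoted to the paralinguistic correlates of stress-inducing task load. Historically, voice stress analysis (VSA) has been conducted using conventional digital signal processing (DSP) techniques. Despite the development of modern methods based on deep neural networks (DNNs), accurately detecting stress in speech remains difficult because of the wide variety of stressors and the variability in individual stress perception. To that end, we introduce a set of five datasets for task load detection in speech. The voice recordings were collected while either cognitive or physical stress was induced in a cohort of volunteers, with a cumulative total of more than a hundred speakers. We used the datasets to design and evaluate a novel self-supervised audio representation that leverages the effectiveness of handcrafted (DSP-based) features and the complexity of data-driven DNN representations. Notably, the proposed approach outperformed both extensive handcrafted feature sets and novel DNN-based audio representation learning approaches.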

translated by Google Translate

From Single-Visit to Multi-Visit Image-Based Models: Single-Visit Models are Enough to Predict Obstructive Hydronephrosis

Stanley Bryan Z. Hua , Mandy Rickard , John Weaver , Alice Xiang , Daniel Alvarez , Kyla N. Velear , Kunj Sheth , Gregory E. Tasian , Armando J. Lorenzo , Anna Goldenberg

Categories: Computer Vision | Artificial Intelligence

2022-12-27

Previous work has shown the potential of deep learning to predict renal obstruction using kidney ultrasound images. However, these image-based classifiers have been trained with the goal of single-visit inference in mind. We compare methods from video action recognition (i.e. convolutional pooling, LSTM, TSM) to adapt single-visit convolutional models to handle multiple visit inference. We demonstrate that incorporating images from a patient's past hospital visits provides only a small benefit for the prediction of obstructive hydronephrosis. Therefore, inclusion of prior ultrasounds is beneficial, but prediction based on the latest ultrasound is sufficient for patient risk stratification.

Enriching Relation Extraction with OpenIE

Alessandro Temperoni , Maria Biryukov , Martin Theobald

Categories: Natural Language Processing | Machine Learning

2022-12-19

Relation extraction (RE) is a sub-discipline of information extraction (IE) which focuses on the prediction of a relational predicate from a natural-language input unit (such as a sentence, a clause, or even a short paragraph consisting of multiple sentences and/or clauses). Together with named-entity recognition (NER) and disambiguation (NED), RE forms the basis for many advanced IE tasks such as knowledge-base (KB) population and verification. In this work, we explore how recent approaches for open information extraction (OpenIE) may help to improve the task of RE by encoding structured information about the sentences' principal units, such as subjects, objects, verbal phrases, and adverbials, into various forms of vectorized (and hence unstructured) representations of the sentences. Our main conjecture is that the decomposition of long and possibly convoluted sentences into multiple smaller clauses via OpenIE even helps to fine-tune context-sensitive language models such as BERT (and its plethora of variants) for RE. Our experiments over two annotated corpora, KnowledgeNet and FewRel, demonstrate the improved accuracy of our enriched models compared to existing RE approaches. Our best results reach F1 scores of 92% and 71% for KnowledgeNet and FewRel, respectively, proving the effectiveness of our approach on competitive benchmarks.

BigText-QA: Question Answering over a Large-Scale Hybrid Knowledge Graph

Jingjing Xu , Maria Biryukov , Martin Theobald , Vinu Ellampallil Venugopal

Categories: Natural Language Processing | Artificial Intelligence

2022-12-12

Answering complex questions over textual resources remains a challenging problem, especially when interpreting the fine-grained relationships among multiple entities that occur within a natural-language question or clue. Curated knowledge bases (KBs), such as YAGO, DBpedia, Freebase and Wikidata, have been widely used in this context and gained great acceptance for question-answering (QA) applications in the past decade. While current KBs offer a concise representation of structured knowledge, they lack the variety of formulations and semantic nuances as well as the context of information provided by the natural-language sources. With BigText-QA, we aim to develop an integrated QA system which is able to answer questions based on a more redundant form of a knowledge graph (KG) that organizes both structured and unstructured (i.e., "hybrid") knowledge in a unified graphical representation. BigText-QA thereby is able to combine the best of both worlds: a canonical set of named entities, mapped to a structured background KB (such as YAGO or Wikidata), as well as an open set of textual clauses providing highly diversified relational paraphrases with rich context information.

Online Estimation of the Koopman Operator Using Fourier Features

Tahiya Salam , Alice Kate Li , M. Ani Hsieh

Categories: Robotics | Machine Learning

2022-12-03

Transfer operators offer linear representations and global, physically meaningful features of nonlinear dynamical systems. Discovering transfer operators, such as the Koopman operator, requires carefully crafted dictionaries of observables acting on states of the dynamical system. This is ad hoc and requires the full dataset for evaluation. In this paper, we offer an optimization scheme to allow joint learning of the observables and Koopman operator with online data. Our results show we are able to reconstruct the evolution and represent the global features of complex dynamical systems.
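To make the idea of estimating a Koopman operator from streaming data concrete, here is a minimal sketch. It is not the paper's optimization scheme: the observables are fixed by hand (two Fourier features of an angle) instead of being learned jointly, and the operator is updated with ordinary recursive least squares. The toy system (a rotation on the circle) and the names `fourier_features`, `OnlineKoopman`, and `estimate_rotation` are assumptions made for illustration only.

```python
import math


def fourier_features(theta):
    """Two Fourier observables of the state (a hand-picked, illustrative dictionary)."""
    return [math.cos(theta), math.sin(theta)]


class OnlineKoopman:
    """Recursive least-squares estimate of K such that phi(x_next) ~= K phi(x)."""

    def __init__(self, dim, prior_scale=1e6):
        self.K = [[0.0] * dim for _ in range(dim)]
        # P tracks the inverse feature covariance; a large start means a weak prior.
        self.P = [[prior_scale if i == j else 0.0 for j in range(dim)]
                  for i in range(dim)]

    def update(self, phi, phi_next):
        """Fold one observed transition (phi -> phi_next) into the estimate."""
        n = len(phi)
        p_phi = [sum(self.P[i][j] * phi[j] for j in range(n)) for i in range(n)]
        denom = 1.0 + sum(phi[i] * p_phi[i] for i in range(n))
        gain = [v / denom for v in p_phi]
        pred = [sum(self.K[i][j] * phi[j] for j in range(n)) for i in range(n)]
        err = [phi_next[i] - pred[i] for i in range(n)]
        for i in range(n):
            for j in range(n):
                self.K[i][j] += err[i] * gain[j]    # correct K along the gain
                self.P[i][j] -= gain[i] * p_phi[j]  # rank-one downdate of P


def estimate_rotation(omega=0.3, steps=200):
    """Stream transitions of theta -> theta + omega and return the learned K."""
    model = OnlineKoopman(dim=2)
    theta = 0.1
    for _ in range(steps):
        nxt = theta + omega
        model.update(fourier_features(theta), fourier_features(nxt))
        theta = nxt
    return model.K
```

For this toy system the exact operator on the chosen observables is the 2x2 rotation matrix with angle `omega`, so the returned `K` can be checked against `cos(omega)` and `sin(omega)`. Each update touches only the current transition, which is what makes the estimate usable without the full dataset.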

Understanding Text Classification Data and Models Using Aggregated Input Salience

Sebastian Ebert , Alice Shoshana Jakobovits , Katja Filippova

Categories: Natural Language Processing

2022-11-10

Realizing when a model is right for a wrong reason is not trivial and requires a significant effort by model developers. In some cases, an input salience method, which highlights the most important parts of the input, may reveal problematic reasoning. But scrutinizing highlights over many data instances is tedious and often infeasible. Furthermore, analyzing examples in isolation does not reveal general patterns in the data or in the model's behavior. In this paper we aim to address these issues and go from understanding single examples to understanding entire datasets and models. The methodology we propose is based on aggregated salience maps. Using this methodology we address multiple distinct but common model developer needs by showing how problematic data and model behavior can be identified -- a necessary first step for improving the model.
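As a rough illustration of moving from per-example highlights to a dataset-level view, the sketch below averages token salience scores across many examples and ranks the tokens. This is only one simple way to aggregate; the abstract does not specify the authors' procedure. The function name `aggregate_salience`, the `min_count` threshold, and the toy scores in the usage note are all hypothetical.

```python
from collections import defaultdict


def aggregate_salience(examples, min_count=2):
    """Rank tokens by mean salience over a dataset.

    examples: iterable of (tokens, scores) pairs, where scores[i] is the
    salience that some attribution method assigned to tokens[i].
    Tokens seen fewer than min_count times are dropped as too noisy.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for tokens, scores in examples:
        for token, score in zip(tokens, scores):
            totals[token] += score
            counts[token] += 1
    means = {t: totals[t] / counts[t] for t in totals if counts[t] >= min_count}
    # Highest mean salience first.
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)
```

For example, with `[(["great", "movie"], [0.9, 0.1]), (["great", "film"], [0.7, 0.3]), (["bad", "movie"], [0.8, 0.2])]` the ranking puts `great` first and `movie` second, while `film` and `bad` are filtered out because each appears only once. A token that ranks high for a class without being meaningful for the task would be a candidate for the "right for a wrong reason" behavior the abstract describes.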

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao , Angela Fan , Christopher Akiki , Ellie Pavlick , Suzana Ilić , Daniel Hesslow , Roman Castagné , Alexandra Sasha Luccioni , François Yvon , Matthias Gallé

Categories: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

PTSD in the Wild: A Video Database for Studying Post-Traumatic Stress Disorder Recognition in Unconstrained Environments

Moctar Abdoul Latif Sawadogo , Furkan Pala , Gurkirat Singh , Imen Selmi , Pauline Puteaux , Alice Othmani

Categories: Computer Vision | Machine Learning

2022-09-28

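Post-traumatic stress disorder (PTSD) is a chronic and debilitating mental condition that develops in response to catastrophic life events such as military combat, sexual assault, and natural disasters. PTSD is characterized by flashbacks of past traumatic events, intrusive thoughts, nightmares, hypervigilance, and sleep disturbance, all of which affect a person's life and lead to considerable social, occupational, and interpersonal dysfunction. PTSD is diagnosed by medical professionals using a self-assessment questionnaire of PTSD symptoms as defined in the Diagnostic and Statistical Manual of Mental Disorders (DSM). In this paper, and for the first time, we collected, annotated, and prepared for public distribution a new video database for automatic PTSD diagnosis, called the PTSD in the wild dataset. The database exhibits "natural" and large variability in acquisition conditions, with differing facial expression, lighting, focus, resolution, age, gender, race, occlusion, and background. In addition to describing the details of the dataset collection, we provide a benchmark for evaluating computer vision and machine learning based approaches on the PTSD in the wild dataset. We also propose and evaluate a deep learning based approach for PTSD detection. The proposed approach shows very promising results. Interested researchers can download a copy of the PTSD-in-the-wild dataset from: http://www.lissi.fr/ptsd-dataset/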

Mixed-domain Training Improves Multi-Mission Terrain Segmentation

Grace Vincent , Alice Yepremyan , Jingdao Chen , Edwin Goh

Categories: Computer Vision

2022-09-27

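Planetary rover missions must utilize machine learning-based perception to continue extraterrestrial exploration with little to no human presence. Martian terrain segmentation is critical for rover navigation and hazard avoidance in support of further exploratory tasks, such as soil sample collection and the search for organic compounds. Current Martian terrain segmentation models require a large amount of labeled data to achieve acceptable performance, and they also require retraining for deployment across different domains (different rover missions) or different tasks (geological identification and navigation). This research proposes a semi-supervised learning approach that leverages unsupervised contrastive pretraining of a backbone for multi-mission semantic segmentation of Martian surfaces. The model expands on current Martian segmentation capabilities by using a mixed-domain training set that ensures feature diversity, so that it can be deployed on different Mars rover missions for terrain navigation. Evaluation using average pixel accuracy shows that the semi-supervised mixed-domain approach improves accuracy compared to single-domain training and supervised training, reaching 97% accuracy for the Mars Science Laboratory's Curiosity rover and also improving accuracy for the Mars 2020 Perseverance rover. Further, applying different weighting methods to the loss function improved the model's correct predictions for minority or rare classes by over 30%, as measured by recall, compared to standard cross-entropy loss. These results can inform future multi-mission and multi-task semantic segmentation for rover missions in a data-efficient manner.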
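The abstract does not say which loss weighting schemes were compared. As one common possibility, the sketch below uses inverse-frequency class weights in a weighted cross-entropy, so that errors on rare classes cost more. The weighting rule, the function names, and the class labels `soil` and `rock` in the usage note are assumptions for illustration, not the paper's setup.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by total / (num_classes * class_count)."""
    counts = Counter(labels)
    total = len(labels)
    return {c: total / (len(counts) * n) for c, n in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class).

    probs: list of dicts mapping class -> predicted probability.
    labels: list of true classes, aligned with probs.
    weights: dict mapping class -> non-negative weight.
    """
    num = sum(-weights[y] * math.log(p[y]) for p, y in zip(probs, labels))
    den = sum(weights[y] for y in labels)
    return num / den
```

With three `soil` pixels predicted at probability 0.9 and one `rock` pixel predicted at 0.2, the rare `rock` class gets weight 2.0 against roughly 0.67 for `soil`, and the weighted loss comes out higher than the uniform-weight loss because the poorly predicted rare class dominates it.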

Ranking-Enhanced Unsupervised Sentence Representation Learning

Yeon Seonwoo , Guoyin Wang , Sajal Choudhary , Changmin Seo , Jiwei Li , Xiang Li , Puyang Xu , Sunghyun Park , Alice Oh

Categories: Natural Language Processing

2022-09-09

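Previous unsupervised sentence embedding studies have focused on data augmentation methods such as dropout and rule-based sentence transformations. However, these approaches offer limited control over the fine-grained semantics of the augmented views of a sentence. This results in supervision signals that are inadequate for capturing the semantic similarity of similar sentences. In this work, we found that using neighbor sentences captures semantic similarity between similar sentences more accurately. Based on this finding, we propose RankEncoder, which uses the relations between an input sentence and the sentences in a corpus to train unsupervised sentence encoders. We evaluate RankEncoder from three perspectives: 1) semantic textual similarity performance, 2) efficacy on similar sentence pairs, and 3) the universality of RankEncoder. Experimental results show that RankEncoder achieves 80.07% Spearman's correlation, a 1.1% absolute improvement over the previous state-of-the-art performance. The improvement is even more significant on similar sentence pairs, at 1.73%. We also demonstrate that RankEncoder is universally applicable to existing unsupervised sentence encoders.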
